

Search for: All records

Creators/Authors contains: "Porter, Donald"


  1. Storage capacity demand is projected to grow exponentially in the coming decade, and so will its contribution to the overall carbon footprint of computing devices. In recent years, cloud providers and device vendors have substantially reduced their carbon impact through improved power consumption and product distribution. However, by 2030, the manufacturing of flash-based storage devices will account for 1.7% of the world's carbon emissions. Therefore, reducing production-related carbon emissions of storage is key to sustainability in computing devices. We present Sustainability-Oriented Storage (SOS), a new host-device co-design for personal storage devices, which opportunistically improves storage sustainability by: (1) targeting widely produced flash-based personal storage devices; (2) reducing hardware production by increasing the bit density of existing materials by up to 50%; and (3) exploiting an underutilized gap between the effective lifespan of personal devices and the longer lifespan of their underlying flash. SOS automatically stores low-priority files, which occupy most of a personal device's storage capacity, on high-density flash memories currently designated for nearline storage. To avoid data loss, low-priority files are allowed to slightly degrade in quality over time. Switching to high-density memories, which maximize production material utilization, reduces the overall carbon footprint of personal storage devices.
    Free, publicly-accessible full text available June 22, 2024
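
    The abstract does not give implementation details, so the following is only a minimal sketch of the kind of placement policy it describes: route cold, low-priority files to a high-density flash region and keep everything else on standard flash. All names, thresholds, and file-type choices below are illustrative assumptions, not details from the paper.

```python
import os
import time

# Hypothetical sketch of SOS-style placement. Low-priority files -- for
# example, media that has not been opened in months -- go to a high-density
# flash region that trades retention for bit density; everything else stays
# on standard flash with normal retention guarantees.

HIGH_DENSITY = "high_density"   # denser cells, allowed to degrade slowly
STANDARD = "standard"

LOW_PRIORITY_EXTS = {".jpg", ".png", ".mp4", ".mov"}  # illustrative only
COLD_AFTER_SECONDS = 90 * 24 * 3600                   # ~90 days without access

def place(path: str) -> str:
    """Return which flash region a file should be stored on."""
    st = os.stat(path)
    ext = os.path.splitext(path)[1].lower()
    is_cold = (time.time() - st.st_atime) > COLD_AFTER_SECONDS
    if ext in LOW_PRIORITY_EXTS and is_cold:
        return HIGH_DENSITY  # tolerate gradual quality loss to save material
    return STANDARD

if __name__ == "__main__":
    for name in os.listdir("."):
        if os.path.isfile(name):
            print(f"{name}: {place(name)}")
```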
  2. Peripheral devices like SSDs are growing more complex, to the point that they are effectively small computers themselves. Our position is that this trend creates a new kind of attack vector, where untrusted software could use peripherals strictly as intended to accomplish unintended goals. To exemplify, we set out to rowhammer the DRAM component of a simplified host-side FTL, issuing regular I/O requests that manage to flip bits in a way that triggers sensitive information leakage. We conclude that such attacks might soon be feasible, and we argue that systems need principled approaches for securing peripherals against them.
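
    The abstract describes the access pattern only at a conceptual level. Purely as an illustration of what "using a peripheral strictly as intended" could look like, this sketch issues repeated direct reads of two fixed logical block addresses so that a hypothetical host-side FTL would re-activate the DRAM rows holding their mapping entries. The device path, the offsets, and the assumption that those offsets collide in DRAM are all hypothetical; no bit flips should be expected from this code.

```python
import mmap
import os

# Access-pattern illustration only (not a working exploit). Alternating reads
# of two logical blocks defeat any single-row caching in the FTL and keep both
# mapping-table DRAM rows being re-activated. Whether flips can ever occur
# depends entirely on the device's internal memory layout, which is unknown.

DEV = "/dev/sdX"   # hypothetical block device -- do not point at real data
BLOCK = 4096
LBA_A = 0x1000 * BLOCK   # offsets assumed (not known) to collide in DRAM
LBA_B = 0x1040 * BLOCK
ITERATIONS = 1_000_000

def hammer() -> None:
    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, BLOCK)  # page-aligned buffer, required for O_DIRECT
    try:
        for _ in range(ITERATIONS):
            os.preadv(fd, [buf], LBA_A)
            os.preadv(fd, [buf], LBA_B)
    finally:
        os.close(fd)
```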
  3. Ransomware has been a growing threat to end-users in the past few years. In response, there is also a burgeoning market for anti-ransomware defense products, as well as research prototypes that explore more advanced, behavioral analyses. Intuitively, ransomware should be amenable to identification through behavioral analysis, since ransomware recursively walks a user's files and encrypts them, overwriting or deleting the plaintext. This paper contributes a study of the effectiveness of these behavior-based ransomware defenses, from both commercial products and academic proposals. We drive the study with a dead-simple ransomware, augmented with a number of both straightforward and new evasion techniques. Surprisingly, our results indicate that most commercial products are strikingly ineffective. Ten out of 15 commercial products could not detect our simple ransomware without any evasive techniques; most of the rest could be evaded, and user data ransomed, with some combination of simple techniques. Only one tool appears to correctly identify our ransomware, but it suffers from staggering false positives, including flagging Windows Explorer, Firefox, and Notepad as ransomware during routine operation. Our paper identifies a number of techniques to manipulate the entropy of encrypted data to match that of the original file. The paper further shows that partial encryption of as little as 3–5% of a file's data is sufficient to ransom most file formats. Finally, we show that a combination of these techniques can render an aggregate malice score that is well below that of a Linux kernel compile. In summary, these results indicate that it is highly likely that ransomware will be able to adapt its behavior to fit within the range of expected benign behaviors, avoiding detection even by future generations of behavioral ransomware detectors.
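
    To make the entropy argument concrete, here is a small self-contained sketch (our own construction, not the paper's artifact) that measures the Shannon entropy of a file before and after encrypting only its first ~5% of bytes. The toy XOR keystream stands in for a real cipher; the point is that whole-file entropy barely moves, which is what frustrates entropy-based detectors.

```python
import hashlib
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte; 8.0 means indistinguishable from random."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def keystream(key: bytes, length: int) -> bytes:
    """Toy SHA-256-based keystream -- a stand-in, not secure encryption."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "little")).digest()
        counter += 1
    return bytes(out[:length])

def partially_encrypt(data: bytes, fraction: float = 0.05) -> bytes:
    """Encrypt only the first `fraction` of the file; headers of most
    formats live there, so this already renders the file unusable."""
    k = int(len(data) * fraction)
    ks = keystream(b"demo-key", k)
    head = bytes(a ^ b for a, b in zip(data[:k], ks))
    return head + data[k:]

if __name__ == "__main__":
    data = open("sample.bin", "rb").read()   # any test file
    ransomed = partially_encrypt(data)
    print(f"before: {shannon_entropy(data):.3f} bits/byte")
    print(f"after:  {shannon_entropy(ransomed):.3f} bits/byte")
```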
  4. Storage devices have complex performance profiles, including costs to initiate IOs (e.g., seek times in hard drives), parallelism and bank conflicts (in SSDs), costs to transfer data, and firmware-internal operations. The Disk-access Machine (DAM) model simplifies reality by assuming that storage devices transfer data in blocks of size B and that all transfers have unit cost. Despite its simplifications, the DAM model is reasonably accurate. In fact, if B is set to the half-bandwidth point, where the latency and bandwidth of the hardware are equal, then the DAM approximates the IO cost on any hardware to within a factor of 2. Furthermore, the DAM model explains the popularity of B-trees in the 1970s and the current popularity of Bε-trees and log-structured merge trees. But it fails to explain why some B-trees use small nodes, whereas all Bε-trees use large nodes. In a DAM, all IOs, and hence all nodes, are the same size. In this article, we show that the affine and PDAM models, which are small refinements of the DAM model, yield a surprisingly large improvement in predictability without sacrificing ease of use. We present benchmarks on a large collection of storage devices showing that the affine and PDAM models give good approximations of the performance characteristics of hard drives and SSDs, respectively. We show that the affine model explains node-size choices in B-trees and Bε-trees. Furthermore, the models predict that B-trees are highly sensitive to variations in the node size, whereas Bε-trees are much less sensitive. These predictions are borne out empirically. Finally, we show that in both the affine and PDAM models, it pays to organize data structures to exploit varying IO size. In the affine model, Bε-trees can be optimized so that all operations are simultaneously optimal, even up to lower-order terms. In the PDAM model, Bε-trees (or B-trees) can be organized so that both sequential and concurrent workloads are handled efficiently. We conclude that the DAM model is useful as a first cut when designing or analyzing an algorithm or data structure, but the affine and PDAM models enable the algorithm designer to optimize parameter choices and fill in design details.
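
    As a worked illustration of the two cost models (our paraphrase; the parameter values are arbitrary assumptions, not the article's measurements): in the DAM, every transfer of B items costs 1, while in an affine model an I/O of k items costs roughly 1 + alpha*k, so the half-bandwidth point falls at k = 1/alpha, where setup and transfer costs are equal.

```python
import math

# Sketch of the DAM and affine cost models discussed above; ALPHA and B are
# illustrative choices, not measurements from the article's benchmarks.

B = 4096          # DAM block size (items per unit-cost transfer)
ALPHA = 1 / 4096  # affine per-item transfer cost, relative to setup cost 1

def dam_cost(n_items: int) -> float:
    """DAM: every block transfer costs exactly 1, regardless of size."""
    return math.ceil(n_items / B)

def affine_cost(n_items: int, io_size: int) -> float:
    """Affine: each I/O pays a fixed setup cost of 1 plus ALPHA per item."""
    n_ios = math.ceil(n_items / io_size)
    return n_ios * (1 + ALPHA * io_size)

if __name__ == "__main__":
    print(f"half-bandwidth point: {1 / ALPHA:.0f} items")
    for io_size in (512, 4096, 65536):
        print(f"io_size={io_size:6d}: DAM={dam_cost(10**6):.0f}, "
              f"affine={affine_cost(10**6, io_size):.1f}")
    # Larger I/Os amortize the setup cost, which is the intuition behind
    # large Be-tree nodes; the DAM, with its one fixed size, cannot see this.
```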
  5. Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This article describes nimble clones in the Bε-tree File System (BetrFS), an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, such as a Bε-tree or a log-structured merge (LSM) tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders of magnitude and outperforms file systems that have specialized Linux Containers (LXC) backends by 3–4×.
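
    Below is a minimal sketch of the copy-on-abundant-write policy described in this abstract. The data structure and threshold are our illustrative assumptions, not BetrFS internals: clones share blocks and buffer small updates as deltas, and a private copy is made only once the buffered changes to a shared block grow large enough to warrant it.

```python
# Toy model of copy-on-abundant-write (illustrative; not BetrFS's actual
# on-disk structures). Clones share blocks; small writes are buffered as
# deltas, and a shared block is copied only when enough of it has changed.

COPY_THRESHOLD = 0.5  # copy once >50% of a shared block is dirty (assumption)
BLOCK_SIZE = 4096

class Block:
    def __init__(self, data: bytearray):
        self.data = data
        self.refcount = 1

class CloneFile:
    def __init__(self, blocks=None):
        self.blocks = blocks if blocks is not None else []
        self.deltas = {}  # block index -> {offset: byte}, pending small writes

    def clone(self) -> "CloneFile":
        """O(1) logical copy: share every block, bump refcounts."""
        for b in self.blocks:
            b.refcount += 1
        return CloneFile(list(self.blocks))

    def write(self, idx: int, offset: int, value: int) -> None:
        block = self.blocks[idx]
        if block.refcount == 1:
            block.data[offset] = value   # private block: write in place
            return
        delta = self.deltas.setdefault(idx, {})
        delta[offset] = value            # shared block: buffer the update
        if len(delta) / BLOCK_SIZE > COPY_THRESHOLD:
            # Writes are now abundant: it pays to break sharing and copy.
            block.refcount -= 1
            private = Block(bytearray(block.data))
            for off, val in delta.items():
                private.data[off] = val
            self.blocks[idx] = private
            del self.deltas[idx]

    def read(self, idx: int, offset: int) -> int:
        # Reads consult pending deltas before falling back to the shared block.
        delta = self.deltas.get(idx, {})
        return delta.get(offset, self.blocks[idx].data[offset])
```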
  6. Wearable devices, such as smart watches and fitness trackers, are growing in popularity, creating a need for application developers to adapt or extend a UI, typically from a smartphone, onto these devices. Wearables generally have a smaller form factor than a phone; thus, porting an app to the watch necessarily involves reworking the UI. An open problem is identifying best practices for adapting UIs to wearable devices. This paper contributes a study and data set of the state of practice in UI adaptation for wearables. We automatically extract UI designs from a set of 101 popular Android apps that have both a phone and a watch version, and manually label how each UI element, as well as each screen in the app, is translated from the phone to the wearable. The paper identifies trends in adaptation strategies and presents design guidelines. We expect that the UI adaptation strategies identified in this paper can have wide-ranging impact for future research and for identifying best practices in this space, such as grounding future user studies that evaluate which strategies improve user satisfaction, or automatically adapting UIs.